
[GLUTEN-11402][VL] Fix decimal partition key serialization to preserve scale #11618

Merged
zhouyuan merged 1 commit into apache:main from acvictor:acvictor/decimalPartition
Mar 5, 2026

Conversation

@acvictor
Contributor

@acvictor acvictor commented Feb 15, 2026

What changes are proposed in this pull request?

This PR fixes decimal partition value serialization by replacing toJavaBigInteger.toString with toJavaBigDecimal.unscaledValue().toString, removes the fallback guard added in #11518, and adds test cases to SQLQuerySuite covering small decimals, zero-scale decimals, negative values, and multi-partition pruning.
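
As a rough illustration of why the unscaled value is the right thing to serialize, here is a minimal sketch that round-trips the kinds of values the new tests cover. It uses plain java.math.BigDecimal as a stand-in for Spark's Decimal, and the class name, method names, and the reader-side `deserialize` step are assumptions made for this example, not Gluten or Velox code.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

public class DecimalPartitionRoundTrip {
    // Writer side: emit every digit of the value with the decimal point removed.
    static String serialize(BigDecimal value) {
        return value.unscaledValue().toString();
    }

    // Hypothetical reader side: rebuild the value from the unscaled digits
    // plus the scale taken from the partition column's declared type.
    static BigDecimal deserialize(String unscaled, int scale) {
        return new BigDecimal(new BigInteger(unscaled), scale);
    }

    public static void main(String[] args) {
        // Small decimal, zero-scale decimal, negative value, larger scale.
        String[] samples = {"0.05", "42", "-100.10", "12345.678"};
        for (String s : samples) {
            BigDecimal original = new BigDecimal(s);
            String wire = serialize(original);
            BigDecimal restored = deserialize(wire, original.scale());
            System.out.println(s + " -> " + wire + " -> " + restored);
        }
    }
}
```

In this sketch each sample comes back unchanged because the scale is supplied separately on the reading side, so no digits need to be dropped when serializing.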

How was this patch tested?

Existing UTs added in #11518, plus the extended "Incorrect decimal casting for partition read" test.

Was this patch authored or co-authored using generative AI tooling?

No

Related issue: #11402

@github-actions github-actions bot added the CORE (works for Gluten Core) and VELOX labels Feb 15, 2026
@github-actions

Run Gluten Clickhouse CI on x86

@acvictor acvictor force-pushed the acvictor/decimalPartition branch from 8d961b0 to d0172fa Compare February 15, 2026 15:22

@acvictor acvictor force-pushed the acvictor/decimalPartition branch from d0172fa to f97d6e1 Compare February 15, 2026 16:46

@acvictor acvictor marked this pull request as ready for review February 15, 2026 18:33
@acvictor
Contributor Author

acvictor commented Feb 15, 2026

@baibaichen @zhouyuan can you please review? Thanks!

@acvictor
Contributor Author

cc @Surbhi-Vijay

@acvictor
Contributor Author

@zhouyuan ping on a review for this, thanks 😊

@zhouyuan
Member

zhouyuan commented Mar 3, 2026

@acvictor Thanks for the fix. The code looks good. However, the log still seems to report some fallbacks on scan. Is this expected?
https://github.com/apache/incubator-gluten/actions/runs/22039361874/job/63677680172?pr=11618#step:8:8427

@baibaichen baibaichen requested review from beliefer and zhouyuan and removed request for zhouyuan March 3, 2026 11:45
@baibaichen baibaichen force-pushed the acvictor/decimalPartition branch from f97d6e1 to a15475e Compare March 3, 2026 11:47

@acvictor
Contributor Author

acvictor commented Mar 3, 2026

> @acvictor Thanks for the fix. The code looks good. However in the log, it seems there are still some fallback on scan reported, is this expected? https://github.com/apache/incubator-gluten/actions/runs/22039361874/job/63677680172?pr=11618#step:8:8427

@zhouyuan this is expected.

Baseline

26/03/02 15:03:48 WARN GlutenFallbackReporter: Validation failed for plan: Exchange[QueryId=34], due to: [FallbackByBackendSettings] Validation failed on node Exchange
26/03/02 15:03:48 WARN GlutenFallbackReporter: Validation failed for plan: Project[QueryId=34], due to: 
 - Validation failed with exception from: ProjectExecTransformer, reason: CheckOverflowInTableInsert is used in ANSI mode, but Gluten does not support ANSI mode.
E20260302 15:03:48.568771 27742 Exceptions.h:53] Line: /work/ep/build-velox/build/velox_ep/velox/exec/Task.cpp:2464, Function:terminate, Expression:  Cancelled, Source: RUNTIME, ErrorCode: INVALID_STATE
26/03/02 15:03:48 WARN GlutenFallbackReporter: Validation failed for plan: Scan parquet spark_catalog.default.dynparttest2, due to: 
 - Unsupported decimal partition column in native scan.
26/03/02 15:03:48 WARN GlutenFallbackReporter: Validation failed for plan: ColumnarToRow, due to: 
 - Unsupported decimal partition column in native scan.
26/03/02 15:03:48 WARN GlutenFallbackReporter: Validation failed for plan: Scan parquet spark_catalog.default.dynparttest2[QueryId=36], due to: 
 - Unsupported decimal partition column in native scan.
26/03/02 15:03:48 WARN GlutenFallbackReporter: Validation failed for plan: ColumnarToRow[QueryId=36], due to: 
 - Unsupported decimal partition column in native scan.
- Incorrect decimal casting for partition read

This PR

26/02/15 17:07:03 WARN GlutenFallbackReporter: Validation failed for plan: Exchange[QueryId=34], due to: [FallbackByBackendSettings] Validation failed on node Exchange
26/02/15 17:07:03 WARN GlutenFallbackReporter: Validation failed for plan: Project[QueryId=34], due to: 
 - Validation failed with exception from: ProjectExecTransformer, reason: CheckOverflowInTableInsert is used in ANSI mode, but Gluten does not support ANSI mode.
E20260215 17:07:03.596613 27433 Exceptions.h:53] Line: /work/ep/build-velox/build/velox_ep/velox/exec/Task.cpp:2455, Function:terminate, Expression:  Cancelled, Source: RUNTIME, ErrorCode: INVALID_STATE
26/02/15 17:07:03 WARN GlutenFallbackReporter: Validation failed for plan: Exchange[QueryId=40], due to: [FallbackByBackendSettings] Validation failed on node Exchange
26/02/15 17:07:03 WARN GlutenFallbackReporter: Validation failed for plan: Project[QueryId=40], due to: 
 - Validation failed with exception from: ProjectExecTransformer, reason: CheckOverflowInTableInsert is used in ANSI mode, but Gluten does not support ANSI mode.
E20260215 17:07:04.033113 27433 Exceptions.h:53] Line: /work/ep/build-velox/build/velox_ep/velox/exec/Task.cpp:2455, Function:terminate, Expression:  Cancelled, Source: RUNTIME, ErrorCode: INVALID_STATE
26/02/15 17:07:04 WARN GlutenFallbackReporter: Validation failed for plan: Exchange[QueryId=46], due to: [FallbackByBackendSettings] Validation failed on node Exchange
26/02/15 17:07:04 WARN GlutenFallbackReporter: Validation failed for plan: Project[QueryId=46], due to: 
 - Validation failed with exception from: ProjectExecTransformer, reason: CheckOverflowInTableInsert is used in ANSI mode, but Gluten does not support ANSI mode.
E20260215 17:07:04.451627 27433 Exceptions.h:53] Line: /work/ep/build-velox/build/velox_ep/velox/exec/Task.cpp:2455, Function:terminate, Expression:  Cancelled, Source: RUNTIME, ErrorCode: INVALID_STATE
26/02/15 17:07:04 WARN GlutenFallbackReporter: Validation failed for plan: Exchange[QueryId=52], due to: [FallbackByBackendSettings] Validation failed on node Exchange
26/02/15 17:07:04 WARN GlutenFallbackReporter: Validation failed for plan: Project[QueryId=52], due to: 
 - Validation failed with exception from: ProjectExecTransformer, reason: CheckOverflowInTableInsert is used in ANSI mode, but Gluten does not support ANSI mode.
E20260215 17:07:04.846966 27433 Exceptions.h:53] Line: /work/ep/build-velox/build/velox_ep/velox/exec/Task.cpp:2455, Function:terminate, Expression:  Cancelled, Source: RUNTIME, ErrorCode: INVALID_STATE
26/02/15 17:07:05 WARN GlutenFallbackReporter: Validation failed for plan: Exchange[QueryId=58], due to: [FallbackByBackendSettings] Validation failed on node Exchange
26/02/15 17:07:05 WARN GlutenFallbackReporter: Validation failed for plan: Project[QueryId=58], due to: 
 - Validation failed with exception from: ProjectExecTransformer, reason: CheckOverflowInTableInsert is used in ANSI mode, but Gluten does not support ANSI mode.
E20260215 17:07:05.233858 27433 Exceptions.h:53] Line: /work/ep/build-velox/build/velox_ep/velox/exec/Task.cpp:2455, Function:terminate, Expression:  Cancelled, Source: RUNTIME, ErrorCode: INVALID_STATE
26/02/15 17:07:05 WARN GlutenFallbackReporter: Validation failed for plan: Exchange[QueryId=59], due to: [FallbackByBackendSettings] Validation failed on node Exchange
26/02/15 17:07:05 WARN GlutenFallbackReporter: Validation failed for plan: Project[QueryId=59], due to: 
 - Validation failed with exception from: ProjectExecTransformer, reason: CheckOverflowInTableInsert is used in ANSI mode, but Gluten does not support ANSI mode.
E20260215 17:07:05.426112 27433 Exceptions.h:53] Line: /work/ep/build-velox/build/velox_ep/velox/exec/Task.cpp:2455, Function:terminate, Expression:  Cancelled, Source: RUNTIME, ErrorCode: INVALID_STATE
26/02/15 17:07:05 WARN GlutenFallbackReporter: Validation failed for plan: Exchange[QueryId=61], due to: [FallbackByBackendSettings] Validation failed on node Exchange
26/02/15 17:07:05 WARN GlutenFallbackReporter: Validation failed for plan: Exchange[QueryId=62], due to: [FallbackByBackendSettings] Validation failed on node Exchange
- Incorrect decimal casting for partition read

The Exchange/Project fallbacks caused by CheckOverflowInTableInsert are pre-existing on the INSERT path, and the baseline has them too. This PR shows more instances because I extended the test from 1 INSERT to 6 INSERTs to cover additional decimal scenarios. The logs do show an improvement over the baseline: Scan parquet spark_catalog.default.dynparttest2 previously fell back with "Unsupported decimal partition column in native scan.", and with this PR that scan fallback is eliminated.

      DateFormatter.apply().format(pv.asInstanceOf[Integer])
    case _: DecimalType =>
-     pv.asInstanceOf[Decimal].toJavaBigInteger.toString
+     pv.asInstanceOf[Decimal].toJavaBigDecimal.unscaledValue().toString
Contributor

@beliefer beliefer Mar 4, 2026

Could you explain why decimal partition keys are not supported?

Contributor Author

It's not that decimal partition keys are unsupported; rather, the old serialization results in incorrect casting (see #11618). The change is needed because Decimal.toJavaBigInteger truncates the fractional part, producing an incorrect unscaled value. For example, Decimal("100.1") with scale=1 would serialize as "100" (the truncated BigInteger) instead of "1001" (the correct unscaled representation). This causes the Velox reader to misinterpret decimal partition values and return wrong query results.
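
The 100.1 case can be reproduced outside Spark with a small sketch. It assumes java.math.BigDecimal.toBigInteger() is a fair approximation of what Decimal.toJavaBigInteger does (keep the integer part, drop the fraction). The class and method names are made up for illustration.

```java
import java.math.BigDecimal;

public class DecimalTruncationDemo {
    // Approximation of the old path: integer part only, fractional digits dropped.
    static String viaBigInteger(BigDecimal d) {
        return d.toBigInteger().toString();
    }

    // New path: all digits of the value, with the decimal point removed.
    static String viaUnscaledValue(BigDecimal d) {
        return d.unscaledValue().toString();
    }

    public static void main(String[] args) {
        BigDecimal d = new BigDecimal("100.1"); // scale = 1
        System.out.println(viaBigInteger(d));    // prints 100
        System.out.println(viaUnscaledValue(d)); // prints 1001
    }
}
```

A reader that treats the string as an unscaled value with scale=1 would turn "100" into 10.0, while "1001" maps back to 100.1.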

Contributor

@beliefer beliefer left a comment

LGTM if tests passed.

@baibaichen baibaichen force-pushed the acvictor/decimalPartition branch from a15475e to 55820f9 Compare March 4, 2026 13:32

@acvictor
Contributor Author

acvictor commented Mar 5, 2026

@zhouyuan does this change look good to you?

Member

@zhouyuan zhouyuan left a comment

👍 Thanks for the fix!

@zhouyuan zhouyuan merged commit a96acea into apache:main Mar 5, 2026
61 of 62 checks passed
@acvictor acvictor deleted the acvictor/decimalPartition branch March 5, 2026 18:13